Draft version September 26, 2008 

Preprint typeset using 1^1^^ style cmulatcapj v. 10/09/06 



oo 
O 
O 
(N 

D 
(N 

6 

ti 



CMB LIKELIHOOD APPROXIMATION BY A GAUSSIANIZED BLACKWELL-RAO ESTIMATOR 
0. RuDjORD^"', N. E. Groeneboom^"', H. K. Eriksen''^'^'^, Greg Huey®, K. M. Gorski'^' '' and J. B. Jewell" 

Draft version September 26, 2008 

ABSTRACT 

We introduce a new CMB temperature likelihood approximation called the Gaussianized Blackwell- 
Rao (GBR) estimator. This estimator is derived by transforming the observed marginal power spec- 
trum distributions obtained by the CMB Gibbs sampler into standard univariate Gaussians, and then 
approximate their joint transformed distribution by a multivariate Gaussian. The method is exact for 
full-sky coverage and uniform noise, and an excellent approximation for sky cuts and scanning patterns 
relevant for modern satellite experiments such as WMAP and Planck. The result is a stable, accurate 
and computationally very efficient CMB temperature likelihood representation that allows the user to 
exploit the unique error propagation capabilities of the Gibbs sampler to high ^'s. A single evaluation 
of this estimator between £ = 2 and 200 takes ~ 0.2 CPU milliseconds, while for comparison, a singe 
pixel space likelihood evaluation between t — 2 and 30 for a map with ~ 2500 pixels requires ~ 20 
seconds. We apply this tool to the 5-year WMAP temperature data, and re-estimate the angular 
temperature power spectrum, Cf, and likelihood, C{C'i), for £ < 200, and derive new cosmological pa- 
rameters for the standard six-parameter ACDM model. Our spectrum is in excellent agreement with 
the official WMAP spectrum, but we find slight differences in the derived cosmological parameters. 
Most importantly, the spectral index of scalar perturbations is rig = 0.973 ± 0.014, 1.9a away from 
unity and 0.6a higher than the official WMAP result, ng = 0.965 ±0.014. This suggests that an exact 
likelihood treatment is required to higher ^'s than previously believed, reinforcing and extending our 
conclusions from the 3-year WMAP analysis. In that case, we found that the sub-optimal likelihood 
approximation adopted between £ = 12 and 30 by the WMAP team biased Ug low by O.ia, while here 
we find that the same approximation between £ = 30 and 200 introduces a bias of 0.6a in rig. 
Subject headings: cosmic microwave background — cosmology: observations — methods: statistical 
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1. INTRODUCTION 

Detailed measurements of fluctuations in the cosmic 
microwave background (CMB) have established cosmol- 
ogy as a high-precision science. One striking illustration 
of this is the fact that it is today possible to predict a 
vast number of observables based on six numbers only, 
with only a few (but nevertheless intriguing) "glitches" 
overall. The key to this success has been making accu- 
rate measurements of the CMB power spectrum, perhaps 
most prominently exemplified by Wilkinson Microwave 
Anisotropy Probe (WMAP; Bennett et al. 2003; Hinshaw 
et al. 2007, Hinshaw et al. 2008). 

The primary connection between theoretical models 
and CMB observations is made through the CMB like- 
lihood, C{Ct) — P(d|Cf). This is a multivariate, non- 
Gaussian function that quantifies the match between the 
data and a given power spectrum, C^. Unfortunately, it 
is impossible to evaluate this function explicitly for mod- 
ern high-resolution data sets, due to the sheer size of the 
problem, and one therefore instead typically resolves to 
various approximations. 

However, given the importance of the CMB in modern 

^ email: oystein.rudjordOastro. uio.no 
^ email: leuat@irio.co.uk (NEC) 
^ email: h.k.k.eriksen@astro.uio.no 

* Institute of Theoretical Astrophysics, University of Oslo, P.O. 
Box 1029 Blindern, N-0315 Oslo, Norway 

^ Centre of Mathematics for Applications, University of Oslo, 
P.O. Box 1053 Blindern, N-0316 Oslo 

® Jet Propulsion Laboratory, California Institute of Technology, 
Pasadena CA 91109 

^ Warsaw University Observatory, Aleje Ujazdowskie 4, 00-478 
Warszawa, Poland 



cosmology, it is of critical importance to characterize this 
likelihood accurately, and all approximations must be 
thoroughly verified. One example is the approximation of 
the large angular scale likelihood, where C{Ci) is strongly 
non-Gaussian. This turned out to be a non-trivial issue 
after the original analys is of t he 3-y ear WMAP temper- 
ature data by h inshaw et al.l (l2007f). in whi ch a Master- 
based (|Hivon et al.„2002l: IVerde et al.ll2003 D approxima- 
tion was used at £ > 12. An exact likelihood analysis 
(jEriksen et al.l l2007b) later demonstrated that this sub- 
optimal approximation, when applied to harmonic modes 
between ^ = 13 and 30, biased the spectral index of scalar 
perturbations, rig, low by 0.4cr. 

A second example is that of non-cosmological fore- 
grounds. Unless properly accounted for, such fore- 
grounds bias the observed power spectrum to high values, 
and can seriously compromise any cosmological conclu- 
sions. While important for temperature observations, 
this is an absolutely crucial issue for polarization obser- 
vations, as the desired CMB in amplitude is comparable 
to or weaker than the interfering foregrounds over most 
of the sky. 

In recent years, a new set of statistical methods have 
been developed that allows the user to add ress these is- 
sues w ithin a single well-defined framework ("J ewell et al.l 
12004 [Wandclt et al. 2004; Erikscn et al. 2003). The 
heart of this method is the Gibbs sampling algorithm 
(see, e.g, Gelfand and Smith 1990), in which samples 
from a (typically complicated) joint distribution are 
drawn by alternately sampling from (simpler) condi- 
tional distributions. In the CMB setting, this is real- 
ized by drawing joint samples from P(s, C^jd), by alter- 
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nately sampling from P{Ci\s,d), where Ce is the CMB 
power spectrum, s is the CMB sky signal, and d are 
the observed data. In addition to allow for exact like- 
lihood analysis at reasonable computational cost, an 
equally important feature of this framework is its unique 
capability of including additional degrees of freedom, 
such as non-cosmologic al foregrounds, into the analysis 
([Eriksen et al.| l2008a','lJ). Further, very recently an ad- 
ditional Metropolis-Ha sting s MCMC sampling step was 
introduced by Jewel l et al.l (|2008D . that effectively re- 
solves the previously describe d inefficiency of the Gibbs 
sampler at low signal-to-noise (|Eriksen et al.ll2004f l. The 
framework has also been extended to handle polarization 
(|Larson et al.| [2007: Eriksen ct al. 2007b) and anisotropic 
universe models (Grocneboom fc Eriksenll2008f ). 

By now, the CMB Gibbs sampler is well established 
and demonstrated to sample efficiently from the exact 
CMB posterior. However, a long-standing issue has been 
the characterization of the joint likeliho od, given a set 
of such samples. Originally, Wandc lt et al.l ()2004l ) pro- 
posed to use the so-called Blackwell-Rao (BR) estima- 
tor for this purpose, and this appr oach was la t er im - 
plemented and studied in detail by IChu et ahl (|2005( ). 
While highly accurate for the large angular scale and high 
signal-to-noise temperature likelihood, it suffers from one 
major drawback: Because it attempts to describe the full 
^max-dimensional likelihood without any constraints on 
allowed correlations, the number of samples required for 
convergence scales exponentially with fmax- In practice, 
this limits the BR estimator to ^ < 30 for temperature 
data, and just ^ ^ 3 — 4 for low signal-to-noise polariza- 
tion data. 

In this paper, we introduce a new temperature likeli- 
hood approximation based on samples drawn from the 
CMB posterior, by modifying the original BR estimator 
in a way that restricts the allowed A^-point functions of 
C{C(), but still captures most of the relevant informa- 
tion. Explicitly, this is done through a specific change 
of variables, such that the observed marginal posterior 
for each multipole, P(Cf |d), is transformed into a Gaus- 
sian. Then, in these new variables the joint distribution 
is approximated by a multivariate Gaussian. As long as 
the correlation between any two multipoles is reasonably 
small, as is the case for nearly full-sky experiments such 
as WMAP and Planck, we shall see that this provides 
an excellent approximation to the exact joint likelihood. 
As a result, the new approach greatly reduces the overall 
number of samples required for convergence, and allows 
us to obtain a highly accurate likelihood approximation 
to arbitrary £niax- Generalization to a full polarized like- 
lihood will be discussed in a future paper (Eriksen et al. , 
in preparation). 

This paper is organized as follows: In we first briefly 
review the Gibbs sampling algorithm together with the 
original Blackwell-Rao estimator, and in ^J3]we introduce 
the new Gaussianized Blackwell-Rao estimator. Next, in 
21 we apply the new estimator to simulated data, and 
compare results with brute-force likelihood evaluations 
in pixel space. In fJSl we analyze the 5-year WMAP tem- 
perature data, and provide an updated power spectrum 
and set of cosmological parameters. We summarize and 
conclude in ^JS) 

2. REVIEW OF THE CMB GIBBS SAMPLER 



We start by reviewing the current state of the 
CMB Gibbs sampling framework, a s previously devel- 
oped through a series of papers (iJewell et al.l 12004 
Wandelt et al] ^2 004: Eriksen et al.l l2004l : iLarson et all 
2007t lEriksen et all I2008af) . and highhght the problem 
of likelihood modelling as currently presented in the lit- 
erature. 

2.1. Elementary CMB Gibbs sampling 

First, we assume that our observations, d, in direction 
n may be modelled in terms of a signal, s and a noise, n, 
component, 

d{n) ^ s{fi) + n{h) . (1) 

Further, we assume that both s and n are Gaussian dis- 
tributed with vanishing mean and covariances S and N, 
respectively. The CMB is in this paper additionally as- 
sumed to be isotropic, such that in spherical harmonic 
space (s(n) — J2e m^<^"i^imifi)) the CMB covariance 
matrix may be written as Sim,tm' = CiSwSmm' , where 
Ci = {aimOLim) is the angular power spectrum. Our 
goal is now to map out the CMB posterior distribution 
P(s,C£|d) and the CMB likelihood C{Ci,) = P{d\Ci). 
Note that we in this paper are concerned with the prob- 
lem of likelihood characterization only, which is a post- 
processing step relative to the Gibbs sampler. For nota- 
tional transparency, we therefore neglect issues such as 
foreground marginalization, instrumental beams, multi- 
frequency observations etc. For full details on these is- 
sues, see, e.g., Eriksen et al. 2008a. 

When working with real-world CMB data, there are a 
number of issues that complicate the analysis. Two im- 
portant examples are anisotropic noise and Galactic fore- 
grounds. First, because of the scanning motion of a CMB 
satellite, the pixels in a given data set are observed over 
unequal amounts of time. This implies that the effective 
noise is a function of pixel location on the sky. Second, 
large regions of the sky are obscured by Galactic fore- 
grounds (e.g., synchrotron, free- free and dust emission), 
and these regions must be rejected from the analysis by 
masking. 

Because of such issues, the total data covariance ma- 
trix S -|- N is dense in both pixel and harmonic space. 
As a result, it is computationally unfeasible to evalu- 
ate and sample directly from P{s,Ct\d). Fortunately, 
this problem was orig inall y solved by J ewell ct aL, (2004) , 
iWandelt et all (|2004l ) and lEriksenet al. (2004), who de- 
veloped a particular CMB Gibbs sampler for precisely 
this purpose. For full details on this method, we refer 
the interested reader to the original papers, and in the 
following we only describe the main ideas. The practi- 
cal implementation of the algorithm used in this paper 
is called "Commander" , and h as been described in detail 
bv lEriksen eFall pOOl . l2008al ) . 

The idea behind the CMB Gibbs sampler is to draw 
samples from the joint posterior by alternately sampling 
from the two corresponding conditionals. The sampling 
scheme may thus be written on the symbolic form 

s''+i^P(s|Q,d) (2) 
C;+i^F(Q|s*+i,d), (3) 

where the left arrow implies sampling from the distribu- 
tion on the right-hand side. Then, after some burn-in 
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period, (s\ C^) will be drawn from the desired distribu- 
tion. The only remaining step is to write down sampling 
algorithms for each of the two above conditional distribu- 
tions, both of which are readily available for our problem, 
since the former is simply a multivariate Gaussian, and 
the second is a product of independent inverse Gamma 
distributions. For one pos sible general samp ling algo- 
rithm for F(Q|s), see, e.g.. lEriksen fc Wehud (|2008i) . 

2.2. The Blackwell-Rao estimator 

The Gibbs sampler produces a set of samples drawn 
from the joint CMB posterior, P(s, C^jd). However, for 
these samples to be useful for estimation of cosmological 
parameters, we have to transform the information con- 
tained in this sample set into a smooth approximation to 
the likehhood C{Ce) = P{d\Ce). In principle, we could 
simply generate a multi-variate histogram and read off 
corresponding values, but this does not work in prac- 
tice because of the large dimensionality of the parameter 
space. 

In the current literature, the best approach for han- 
dling this problem is the Black\yell-Ra o (BR) estimator 
(jWandelt et al.l 12004 [Chu et al.ll2005l ). which attempts 
to smooth the sampled histogram by taking advantage 
of the known analytic distribution, P{Cg\s): First, we 
define the observed power spectrum, ai, of the current 
CMB sky Gibbs sample, s(n) = J2i m '^tmYimin), 
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Then the BR estimator is derived as follows, 
P{Ci\A) = J P{Ce,s\d)ds 



P{Ci\s,d)P{s\d)ds 
PiCi\ae)Piae\d) Dae 



No 



(5) 



In other words, the BR estimator is nothing but the av- 
erage oi P{Cg\ae) over the sample set, where ae refers to 
the power spectrum of a full-sky noiseless CMB signal 
Gibbs sample. This distribution has a simple analytic 
expression fe.g.. lChu et"all[200l . 



c: 



(6) 



While Equation [5] does constitute a computationally 
convenient and accurate approximation to the full likeli- 
hood for some special applications, it suffers badly from 
poor convergence properties with increasing dimensional- 
ity of the sampled space. This behaviour may be under- 
stood in terms of relative distribution widths: Suppose 
we want to map out an ^max-dimensional distribution, 
and each of the univariate Blackwell-Rao functions [i.e., 
P{Ci\ae)\ have a standard deviation of, say, 90% of the 
corresponding marginal distributions. The total volume 
fraction spanned by a single sample in €,nax dimensions 



is then / = 0.9^"""', an exponentially decreasing function 
with ^max- Therefore it also takes an exponential num- 
ber of samples in order to build up the full histogram, 
and this becomes computationally u nfeasible for real istic 
data sets aheady at ^^ax > 30 - 50 (IChu et al.|[2005f ). 

The main problem with this approach is that one at- 
tempts to map out all possible TV-point correlation func- 
tions between all multipoles. The number of such Ap- 
point functions is obviously overwhelming with increas- 
ing dimensionality. But this also hints at a possible res- 
olution of the problem: We know by experience that the 
CMB likelihood is a reasonably well behaved function, in 
that 1) there are only weak correlations between multi- 
poles for data sets with nearly full-sky coverage, and 2) 
that even including just two-point correlations (in trans- 
formed variables) produces very reasonable results (e.g.. 
Bond, Jaffe & Knox 2000; Verde et al. 2003). This intu- 
ition will be used in the next section to define a stable 
likelihood estimator. 

3. THE GAUSSIANIZED BLACKWELL-RAO ESTIMATOR 

We now introduce a new Gibbs-based likelihood esti- 
mator we call the "Gaussianized Blackwell-Rao estima- 
tor" (GBR). The basic idea behi nd this approach is sim- 
ilar to that em ployed by, e.g., iBond et all (|2000D and 
iHamimeche fc L ewis (2003), namely approximation by a 
multivariate Gaussian in a transformed set of variables. 

Explicitly, our approximation is defined by transform- 
ing the univariate marginal distributions P{Cf,\d) into 
Gaussianized variables, xi, and then assuming a multi- 
variate Gaussian distribution in these transformed vari- 
ables, 

P(Q|d)=fn|§l ^Wd). (7) 



Here is the Jacobian of the transformation, and x = 
{xi\ is a Gaussian random vector with mean /i = {/^f} 
and covariance matrix Cw = {{xi — iJi){xe' — He')) ■ 
Thus, the approximation of our likelihood estimator re- 
lies on the assumption that 



P(x|d) 



,-i(x-M)^C-i(x-p) 



(8) 



Note that this is by construction exact for the full-sky 
uniform noise case, because the covariance matrix in this 
case is diagonal, and the full expression factorizes in i] in 
that case we are only performing an identity operation. 

3.1. Transformation to Gaussian marginal variables 

The first step in our algorithm is to compute the 
change-of-variables rule from Ce to xe that transforms 
the marginal distribution, P{Ci\d), for each i into a 
Gaussian distribution, P{xe\d). The data used for this 
process are the ae samples drawn from the joint posterior 
P(Cf, s|d) by the CMB Gibbs sampler. 

We use two different methods of estimating the 
marginal distributions from these samples. The first ap- 
proach is to estimate P(C^ |d) with the Blackwell-Rao 
estimator as defined by Equation [5l over a grid in Ce 
for each £. Then, a cubic spline is fitted to the result- 
ing distribution. This is the preferred approach for high 
signal-to-noise or \ow-£ modes. 

However, for low signal-to-noise and high-^ modes one 
observes similarly poor convergence properties of this 



o 
o 

• ^ 

N 
Id 



o 






1 2 3 1 2 2 4 

3 2 

Power spectrum, /(/+l)/27i (10 |iK ) 

Fig. 1. — Slices through the joint UkeUhood from a low-resolution simulation, computed by brute-force pixel space evaluation (black 
lines) and the Gaussianized Blackwell-Rao estimator (red lines). Green lines show the marginal distribution for each £, to illustrate the 
effect of mode coupling caused by the WMAP KQ85 sky cut used in this analysis. 



marginal estimator as for the full joint estimator. In 
these cases we therefore instead compute a simple his- 
togram direct ly from the Cj samples, and fit a smooth 
spline ( Green fc Silvermanll9 94] through this histogram. 
For further stability, we also produce 0(10^) samples 
from P{C£\a^) based on the same set as used for the 
BR estimator. This essentially corresponds to comput- 
ing the Blackwell-Rao estimator by Monte Carlo, and the 
computational cost of producing these extra samples is 
small. (The computational expense of the Gibbs sampler 
is driven by samphng from P(s|C^,d), not by P{Ce\s).) 
Note that this a pproach naturally suppo rts arbitrary d 
binning schemes (|Eriksen fc Wehuj|2008f ). and also inter- 
fac es naturally with th e hybrid MCMC scheme described 
bv lJewelletaJ] (|2004D . 

Given these spline approximations to P{Ci\d) for each 
£, we compute the corresponding cumulative distribu- 
tions by numerical integration. 



Xi{Ci), again stored in the form of cubic splines, that 
allows for very efficient transformation from standard to 
Gaussian variables for arbitrary values of Ci. From these 
splines, it is also easy to compute the derivatives required 
for the Jacobian in Equation [71 



This is subsequently identified with a standard Gaussian 
distribution with zero mean and unity variance. Explic- 
itly, we find xi{Ce) over a grid in Ci such that 



F{Ci\d) = Fc^^ssixi) 



1 



erf ( 



where erf is the error function. This equation is straight- 
forward to solve using standard numerical root-finding 
routines. The result is a convenient set of look-up tables 



3.2. Estimation of the joint Gaussian density 

Having defined a change-of-variables for each i, the re- 
maining task is to estimate the joint distribution, P(x|d), 
in the new variables. In this paper, we approximate 
this distribution by a joint Gaussian, but any parametric 
function could of course serve this purpose. For example, 
we implen iented support for the skew- Gaussian distribu- 
tion (e.g.. lAzzalini &: Capitanid 120031 ) in our codes, but 
found that the improvement over a simple Gaussian was 
very small. 

The only free parameters in this multivariate Gaus- 
sian distribution are the mean, fj,, and the covariance, C. 
These are again estimated from the samples produced 
by the Gibbs sampler. First, we draw N ^ O{10^) Ce 
samples from P{Ce\cr£), as described above, but this time 
including all ^'s for each sample. Then we Gaussianize 
these £-hy-£, by evaluating xe{Ce) for each sample and 
multipole moment. Finally, we compute the correspond- 
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Fig. 2. — Comparison of amplitude — tilt likelihoods derived from 
the 5-year WMAP GBR estimator up to f < 250, for two indepen- 
dent sample sets (solid and dashed lines). Contours are where 
—2lnC{Ci) rises by 0.1, 2.3, 6.17 and 11.8 from the minimum, cor- 
responding to the peak and 1, 2 and 3cr confidence regions in a 
Gaussian distribution. See main text for full details. The cross 
marks the point {q, n) = (1, 0), corresponding to the best-fit model 
for WMAP including all ^'s, and this is found to lie well inside the 
la contour. 
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where the sums run over sample index. 

4. APPLICATION TO SIMULATED DATA 

Before applying the machinery described in the pre- 
vious section to the 5-year WMAP data, we verify the 
method by a analyzing a simulated low-resolution data 
set. The reason for considering a low-resolution simula- 
tion is that only in this case is it possible to evaluate the 
exact likelihood by brute force in pixel space, without 
making any approximations. 

The simulation is made by drawing a Gaussian real- 
ization fro m the best-fit 5-year WMAP ACDM power 
spectrum (jKomatsu et al.|[2008[) . smoothing this with a 
10° FWHM Gaussian beam, and projecting it on an 
Nside = 16 HEALPix^ grid. Finally, 20^iK RMS white 
noise is added t o each pixe l, and the (degraded) WMAP 
KQ85 sky cut ()Gold et all [2008) is apphed to the data. 
The maximum multipole considered in this analysis was 
^max = 47, and the spectrum was binned with a bin size 
of A£ = 5 from £ = 20. The signal-to-noise is unity at 
£ = 19, and negligible beyond £ > 30. 

We now compute slices for each £ through the full 
multivariate likelihood, both with the method described 
in j}3l and by brute - force pixel space evaluation (e.g., 
lEriksen et al.l [2007a| ). fixing all other ts at the input 
AGDM spectrum. For comparison, we also compute the 
the marginal distributions for each £. 



The results from this exercise are shown in Figure [TJ 
Black lines indicate the brute-force likelihoods, and red 
lines show the Gaussianized Blackwell-Rao likelihoods. 
The green lines show the marginal distributions, visual- 
izing the effect of mode coupling due to the sky cut. 

First, we see that all distributions agree very closely at 
^ < 8. In this very large-scale regime, all harmonic modes 
are sufficiently well sampled with the KQ85 sky cut that 
mode coupling is negligible. However, from ^ > 10 the 
marginal distributions are noticeably different from the 
likelihood slices, with a typical shift in peak position of 
^ lOO^K^'s. We also see that these correlations are ac- 
curately captured by the Gaussian approximation imple- 
mented in the GBR estimator, as the GBR likelihoods 
are essentially identical to the brute-force slices up to 
£^ 18. 

At the very high £ and low signal-to-noise end, we 
see slight differences between the GBR and the pixel 
space slices, and in fact, the agreement is better with the 
marginal distributions. This is caused by poor conver- 
gence of the covariance matrix in this particular run, and 
is included here for pedagogical purposes only: In a real 
analysis, one must always make sure that all distribu- 
tions have converged well, typically by analyzing differ- 
ent chain sets separately. Note also that with sufficiently 
wide bins, the correlations to neighboring bins eventually 
vanish, and in this case it may be better to remove these 
correlations by hand from the covariance matrix, rather 
than trying to estimate them by sampling. Whether this 
is the case or not for a given set can again be estimated 
by jack-knife tests. Finally, for the 5-year WMAP anal- 
ysis presented in this paper, we will only use the GBR 
estimator in the high signal-to-noise regime, and in that 
case the distributions converge very quickly. 

5. 5- YEAR WMAP TEMPERATURE ANALYSIS 

We now apply the tools described in ^to the 5-year 
WMAP temperature data. We only consider £ < 200 
in this paper, to avoid issues with error propagation for 
unresolved point sources and beam estimation. However, 
we do correct for the mean spectrum of unresolved point 
sources, as described below. 

5.1. Data 

We analyze the foreground reduced 5-year WMAP V- 
band temperature sky maps, which are available from 
Lambda^. The V-band data was chosen because these 
are considered to be the cleanes t in terms of foreg rounds 
out of the five WMAP bands (|Gold et al.ll2008[ i. Fur- 
ther, at £ < 200 the V-band alone is strongly cosmic 
variance dominated, and one does therefore not gain 
any significant statistical power by co-adding with other 
bands. Instead, one only increases the chance of intro- 
ducing foreground biases by adding more frequencies. 
We work with the individual differencing assembly (DA) 
maps (Hinshaw et al. 2003), and take into account the 
beam and noise pattern for each map separately. 

The WMAP sky maps are pixelized at a HEALPix res- 
olution of iVsidc = 512, corresponding to a pixel size of 7', 
and the instrumental beam of the two V-band channels 
has a FWHM of 21'. We therefore impose an upper har- 
monic mode limit of ^max ~ 700 in the Gibbs sampling 
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Fig. 3. — Number of Ct samples required for the GBR covariance 
matrix to converge, as a function of ^max- 
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Fig. 4. — GBR correlation matrix, Cui, for the 5-year WMAP 
V-band data. 

(Commander) step, probing deeply into the noise domi- 
nated regime. Note, however, that we only use i < 200 in 
the GBR estimator, to avoid high-^ complications, such 
as beam and point source error propagation, in the cos- 
mological parameter stage. 

We correct the spectrum for unresolved point sources 
using the WMAP model. Explicitly, the mean spectrum 
due to unresolved point sources in a sing le frequency, 
for the 5-year WMAP data is modelled as (jHinshaw et al.l 
[20031 [20071 : iNolta et al.l[200il 
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where Apg = 0.011 ± 0.001 is the point source amplitude 
relative to the Q-band channel {i^o = 41GHz), /3 = —2.1 
is the best-fit spectral index of the point sources, and 
a{iy) is the conversion factor between antenna and ther- 
modynamic temperature units. To correct for this in our 
analysis, we subtract C^'^ , evaluated at v — 61GHz, from 
each ai sample before computing the GBR estimator. 

Finally, we im pose the WMAP KQ85 sky cut 
([Gold et al.ll2008D on the data that masks point sources, 
removing 18% of the sky. Note that we adopt the tem- 
plate corrected maps provided by the WMAP team in 
this analysis, and postpone an internal Gibbs sampling 
based foreground analysis to a future paper; for now our 
main focus is the new likelihood approximation, not the 
impact of foregrounds. 

5.2. Analysis overview 
The analysis consists of the following steps: 
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Fig. 5. — Comparison of 5-year power spectra obtained by 
WMAP and Commander/GBR up to £ = 200. The bottom panel 
shows the difference of these two spectra, and the gray band indi- 
cates the 68% confidence region due to noise and sky cut only, not 
cosmic variance. 

1. Generate 4000 ai samples with Commander from 
the 5-year VI and V2 differencing assemblies , in- 
cluding ^'s up to irnax = 700, divided over 8 chains. 

2. Generate 500000 Ce samples from these cr^'s, in- 
cluding ^'s between £ — 2 and 250. 

3. Compute the corresponding GBR parameters, i.e., 
transformation tables, means /i and covariance ma- 
trix C. 

4. Modify the 5-year WMAP temperature likelihood 
by replacing the existing low-^ part with Equation 
[71 with the parameters given in (3). The transition 
multipole between the low-£ and high-£ is increased 
from £ = 32 to 200. Muhipoles between = 201 
and 250 are included in the GBR estimator to avoid 
truncation effects, but the spectrum in this range 
is kept fixed at a fiducial spectrum, in order not to 
count these multipoles twice. 

5. Cosmol ogical parameters are estimated using Cos- 
moMC (iLewis fc Bridld[200l . 



5.3. Convergence analysis 

Before presenting the results from the WMAP analy- 
sis, we consider the question of convergence. First, we 
comp ute the Gelman- Rubin statistic (Gclman fc Rubiiil 
119921 ) for each ai using the eight chains computed with 
Commander and removing the first 20 samples for burn- 
in. We find that i? - 1 is less than 0.01 for i < 300 and 
less than 1.1 for £ < 500, indicating very good conver- 
gence in terms of power spectra. 

However, the fact that each ai individually is well con- 
verged does not automatically imply that the full likeli- 
hood is well converged, since the latter depends crucially 
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Fig. 6. — Comparison of likelihood slices from the standard WMAP likelihood code (dashed lines) and the new GBR likelihood (solid 
lines). 



on the correlations between ais. To assess the conver- 
gence in terms of cosmological parameters, we therefore 
analyse a toy model, by fitting a simple two-parameter 
amplitude and tilt, q and n, model. 



'fid 



(12) 



^ -^pivot ^ 

to the WMAP data between 1 = 2 and 250 with the 
GBR likehhood. Here Cf"^ is a fiducial power spectrum, 
which is chosen t o be the best-fit 5-ye ar WMAP ACDM 
power spectrum (jKomatsu et al.|[2i308D . and fpivot — 150. 
We then map out the likelihood in a grid over q and 
n. This is repeated twice, first including samples from 
chains number 1 to 4 and then from chains number 5 to 
8. 

The results from this exercise are shown in Figure [3 in 
terms of two sets of likelihood contours, corresponding 
to each of the two chain sets, respectively. The agree- 
ment between the two is excellent, indicating that we 
also have good convergence in terms of cosmological pa- 
rameters with the existing sample set. Note also that 
the point {q,n) = (1,0) lies well inside the lu confidence 
region, indicating that the best-fit WMAP model, which 
is obtained including ^'s between £ = 2 and 1024, also is 
a good fit to ^ = 2 to 250 separately. 

Third, as described in ^ we construct the GBR covari- 
ance matrix from N = 0(10^) Ci samples drawn from 
the (smaller) set of samples. An outstanding question 
is how large N should be in order for this covariancc ma- 
trix to reach convergence, as a function of ^max- To settle 
this question, we carry out the following simple exercise: 



First we produce two Ci sample sets, each containing N 
samples, and all drawn from a single ai sample. Second, 
we compute the two corresponding covariancc matrices, 
invert these, then subtract them from each other, and 
finally compute the standard deviation of all elements. 
Third, we define the inverse covariancc matrix to be con- 
verged if the RMS MCMC noise is less than 0.005, cor- 
responding to 0.5% of the diagonal elements. (We have 
checked that this produces robust parameter estimates.) 
We then find the smallest N such that this is satisfied, 
as a function of ^max- 

The results from this exercise are shown in Figure [H 
Here we see that the number of samples required for con- 
vergence rises rapidly up to £ ~ 30, reaching a maximum 
of ~ 8 X 10* samples, and then flattens to a plateau. To 
be on the safe side, we therefore always use either 5 x 10'°^ 
or 10^ samples in the WMAP analysis. 

The reason for this behaviour becomes intuitive when 
considering the structure of the actual matrix. This is 
shown in Figure [H on the form of a correlation matrix 



Ceei = 



Ct 



Hi' 



(13) 



The main features of this matrix are negative correla- 
tions around the diagonal, with the largest amplitudes 
observed between £ and £±2. This is expected: First, 
two modes separated by = 1 have different parity, and 
can therefore not easily mimic each other. On the other 
hand, modes separated by = 2 have both identical 
parity and similar angular scale, and it is therefore pos- 
sible to add power to one mode and subtract it from the 
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other, and still maintain an essentially unchanged image. 
The result is a noticeable anti-correlation between £ and 
£±2. 

At larger separations in ^, the correlations die off 
rapidly, since it is difficult for a large-scale mode to mimic 
a small-scale mode with a reasonably small sky cut. And 
this explains the convergence behaviour seen in Figure [31 
The covariance matrix is strongly band-limited. There- 
fore, once one has a sufficiently large number of sam- 
ples for a sub-block to converge, there is also enough 
samples for a sub-block further away to converge. These 
are essentially uncorrelated. 

5.4. Results 

We now present the main results derived from the 5- 
year WMAP temperature data with the GBR estimator 
between i = 2 and 200. First, in the top panel of FigureO 
we plot the power spectrum obtained by maximizing the 
GBR likelihood together with the ofhcial 5-year WMAP 
power spectrum. The bottom panel shows the difference 
between these two, and the gray band indicates the stan- 
dard deviation of ai, i.e., the uncertainty due to noise 
and sky cut, but not to cosmic variance. Clearly, the 
agreement between the two power spectra is very good. 

Next, in Figure [H] we compare a few selected slices 
through the GBR likelihood with slices through the 
WMAP likelihood. All other £'s than the one currently 
considered are kept fixed at the best-fit ACDM spectrum. 
Here we see that there are small shifts in peak positions, 
corresponding to the small differences seen in the power 
spectra in FigureO However, a main point in this plot is 
that the GBR likelihood slices are well behaved even at 
the highest ^'s, and this is not the case for the standard 
BR estimator (jChu et al.ll2005f ). 

Finally, in Table [T] and Figure [7| we summarize the 
marginal cosmological parameter posteriors obtained 
with the two likelihood codes from CosmoMC. Interest- 
ingly, there are some notable differences at the 0.3-0.6tT 
level, with the most striking example being the spectral 
index of scalar perturbations, Us = 0.973 ± 0.014. This 
is only 1.9a away from unity, and 0.6a higher than the 
official WMAP values. 

6. CONCLUSIONS 

We have presented a new likelihood approximation to 
be used within the CMB Gibbs sampling framework. 
This approximation is defined by Gaussianizing the ob- 
served marginal power spectrum posteriors, P{Ci\d), 
through a specific change-of-variables, and then cou- 
pling these univariate posteriors into a joint distribution 
through a multivariate Gaussian in the new variables. 
This process is exact, i.e., an identity operation, in the 
uniform and full-sky coverage case, and it is also an excel- 
lent approximation in for the moderate sky cuts relevant 
to satellite missions such as WMAP and Planck. 

Our new approach relies on the p reviously described 
CMB Gibbs samp l ing framework (iJewell et al.l l2004l : 
IWandelt et al.ll2004l : [Er'iksen et al.ll2004( ). andthCTebv in- 
herits many important advantages from that. First and 
foremost, this framework allows for seamless propaga- 
tion of uncertainties from various systematic effects (e.g., 
foregrounds, beam uncertainties, calibration or noise es- 
timation errors) to the final cosmological parameters. 
This is not straightforward in the hybrid scheme used 



by the WMAP code. Second, this new approach corre- 
sponds to the exact low-^ pixel space likelihood part of 
the WMAP code, not the approximate high-£ MASTER 
part. Still, our method can handle arbitrary high ^'s. 
Third, once the one-time pre-processing step has been 
completed, the computational expense of our estimator 
is determined by the cost of ^max spline evaluations, while 
a pixel space approach requires a matrix inversion, and 
therefore scales as 0{N^-^). For the cases considered in 
this paper, the CPU time required for the GBR WMAP 
estimator up to £ = 200 was ^ 0.2 milliseconds, while 
it was ~ 20 seconds for the pixel space approach up to 
£ = 32, for a map with 2500 pixels. 

In order to validate our estimator, we applied it to a 
low-resolution simulated data set, and compared it to 
slices through the exact joint likelihood as computed by 
brute-force evaluation in pixel space. The agreement be- 
tween the two approaches was excellent. We then ap- 
plied the same estimator to the 5-year WMAP temper- 
ature data, and estimated both a new power spectrum 
and new cosmological parameters within a standard six- 
parameter ACDM model. 

The results from these calculations are interesting. 
First, our power spectrum is statistically very similar 
to the official WMAP spectrum, with no visible biases 
seen and relative fluctuations within the level predicted 
by noise and sky cut. Nevertheless, we do find significant 
differences in terms of cosmological parameters, and most 
notably in the spectral index of scalar perturbations, Ug. 
Specifically, we find Us = 0.973 ± 0.014, which is only 
1.9a away from unity and 0.6a higher than the official 
WMAP result, = 0.965 ± 0.014. 

This result resembles very much the outcome of a re- 
analy sis we did with the 3-year WMAP temperature 
data (jEriksen et al.l [200731 ). for which we found a bias 
of 0.4(7 in rig compared to the official WMAP results. 
This bias was due to the sub-optimal MASTER-based 
likelihood approximation ([Hivon et al.ll2002l : IVerde et al.l 
I2003f) used by the WMAP team between i = 12 and 30, 
whereas we used an exact estimator in the same range. 
This study later prompted the WMAP to change their 
codes to use an exact likelihood evaluator up to ^ = 30. 

In the same study, we also tried to increase the £-range 
for our exact estimator to i = 50, but found small differ- 
ences. We therefore concluded, perhaps somewhat pre- 
maturely, that an exact estimator up to ^ = 30 was suf- 
ficient for obtaining accurate results. On the contrary, 
in this paper we find still find significant changes when 
increasing the exact estimator up to £ = 200. 

In retrospect, this should perhaps not come as a com- 
plete surprise, when realizing that the impact on a partic- 
ular cosmological paramete r typically depends logarith - 
mically on £. For instance, iHamimeche fc Lewis! ( 20081 ) 
considered a simple power spectrum model with a sin- 
gle free amplitude, Ci = qCf"^, and found that, for a 
given likelihood estimator to be "statistically unbiased" , 
the systematic errors in that same estimator must fall off 
faster than ^ l/£. 

A similar consideration holds for Ug. Intuitively, Ug is 
as much affected by £ = 2 to 10 as it is between £ = 
20 and 100. In the previous 3-year WMAP re-analysis 
paper, we increased the range of the accurate likelihood 
estimator from £ = 12 to 30, corresponding to a factor of 
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Fig. 7. — Comparison of marginal cosmological parameter posteriors obtained with tlie standard WMAP likelihood code (dashed linos) 
and with the modified GBR likelihood code up to = 200 (solid line) for 5-year WMAP data. 



TABLE 1 

Marginal 5-year WMAP cosmological parameters 



Parameter 


WMAP 


GBR 


Shift in (T 




0.0228 ± 0.0006 


0.0230 ± 0.0006 


0.4 




0.109 ±0.006 


0.0108 ± 0.006 


-0.3 


log(lOiOAs) 


3.06 ± 0.04 


3.06 ± 0.04 


0.0 


h 


0.722 ±0.03 


0.732 ± 0.03 


0.3 




0.965 ±0.014 


0.973 ±0.014 


0.6 


T 


0.090 ±0.02 


0.090 ±0.02 


0.0 



Note. — Comparison of cosmological parameters obtained 
with the standard 5-year WMAP likelihood code (second col- 
umn) and with the new GBR estimator at £ < 200 (third col- 
umn), given in terms of marginal means and standard devia- 
tions. The shift between the two in units of cr is listed in the 
fourth column. 

2.5 in £^ and removed a bias of ~ OAa in Ug. In this paper, 
we increase the range from £ = 30 to 200, corresponding 
to a factor of 6.7 in i, and find an additional bias of 0.6cr. 
However, increasing £ from 30 to 50 corresponds only to 
a factor of 1.7 in i, and this appears to be too small to 
produce a statistically significant result. 

The main conclusions from this work are two-fold. 
First, it seems that an accurate likelihood description 
is required to higher £'s than previously believed, and at 
least up to ^ = 200, in order to obtain unbiased results. 
By extrapolation, it also does not seem unlikely that even 
higher multipoles should be included. This issue will be 
revisited in a future publication. 

Our second main conclusion is that we find a spec- 
tral index only 1.9a away from unity, namely Ug — 
0.973 ± 0.014. To us, it therefore seem premature to 
make strong claims concerning ^ 1', the statistical 



significance of this is rather low, and there are likely still 
unknown systematic errors in this number. 

In a future publication we will generalize the GBR esti- 
mator to polarization. Once completed, this will enable a 
fully Gibbs-based CMB likelihood analysis at low fs, and 
remove the need for likelihood techniques based on ma- 
trix operations, i.e., inversion and determinant evalua- 
tion. The computational cost of a standard cosmological 
parameter MCMC analysis (e.g., CosmoMC) will then 
once again be driven by the required Boltzmann codes 
(e.g., CAMB or CMBFast) and not by the hkelihood eval- 
uation. In turn, this will increas e the importance of fast 
interpolation codes such as Pic o (iFendt fc Wandclt 20071) 
or COSMONET (Aul d erani2007f ). With such fast al- 
gorithms for both spectrum and likelihood evaluations 
ready at hand, the CPU requirements for cosmological 
parameter estimation may possibly be reduced by orders 
of magnitude. 
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